Publications
Citation
Rasko, D. A., Rosovitz, M. J., Myers, G. S., Mongodin, E. F., Fricke, W. F., Gajer, P., Crabtree, J., Sperandio, V., Ravel, J.
The Pan-genome Structure of Escherichia coli: Comparative Genomic Analysis of E. coli Commensal and Pathogenic Isolates
J Bacteriol. 2008 Aug 01; 190(20): 6881-93.
Abstract
Whole genome sequencing has been skewed towards bacterial pathogens as a consequence of the prioritization of medical and veterinary diseases. However, it is becoming clear that in order to accurately measure genetic variation within and between pathogenic groups, multiple isolates, as well as commensal species, must be sequenced. This study examines the pan-genomic content of E. coli. Six distinct E. coli pathovars can be distinguished using molecular or phenotypic markers, but only two of the six pathovars had been subjected to any genome sequencing. As such, this report is the seminal description of the genomic content and unique features of three unsequenced pathovars: ETEC, EPEC and EAEC. We have also determined the first genome sequence of a human commensal E. coli isolate, E. coli HS, which will undoubtedly provide a new baseline from which one can examine the evolution of pathogenic E. coli. Comparison of 17 E. coli genomes, eight of which are new, identified approximately 2200 genes conserved in all isolates. We were also able to identify genes that were isolate- and pathovar-specific. Fewer pathovar-specific genes were identified than anticipated, suggesting that each isolate may have independently developed virulence capabilities. Pan-genome calculations indicate that E. coli genomic diversity represents an open pan-genome model containing a reservoir of greater than 13000 genes, many of which may be uncharacterized but important virulence factors. This comparative study of the E. coli species, while descriptive in nature, will provide the basis for future functional work on this important group of pathogens.
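The abstract does not describe how the core and pan-genome figures were computed. As a minimal sketch of the general idea only, and not the authors' method, the Python below treats each genome as a set of gene-family identifiers. The core genome is the intersection of those sets and the pan-genome is their union. The isolate names, gene identifiers and function names are hypothetical toy values, not data from the study.

```python
# Toy illustration of core/pan-genome bookkeeping with sets.
# All isolate names and gene IDs are made up for this sketch.

def core_and_pan(genomes):
    """Return (core, pan) for a dict mapping isolate name -> set of gene IDs."""
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)  # genes present in every isolate
    pan = set.union(*gene_sets)          # genes present in at least one isolate
    return core, pan

def new_genes_per_genome(genomes, order):
    """Count genes not seen before as each genome is added in the given order."""
    seen = set()
    counts = []
    for name in order:
        counts.append(len(genomes[name] - seen))
        seen |= genomes[name]
    return counts

toy = {
    "isolate_A": {"g1", "g2", "g3", "g4"},
    "isolate_B": {"g1", "g2", "g3", "g5"},
    "isolate_C": {"g1", "g2", "g6", "g7"},
}

core, pan = core_and_pan(toy)
new_counts = new_genes_per_genome(toy, ["isolate_A", "isolate_B", "isolate_C"])
print(len(core), len(pan), new_counts)
```

In this framing, a pan-genome is usually called "open" when the count of new genes per added genome does not fall to zero as more genomes are included. A real analysis would also need ortholog clustering and averaging over many genome orderings, both of which are omitted here.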